Entity-Based Keyword Search in Web Documents

Authors

  • Enrico Sartori
  • Yannis Velegrakis
  • Francesco Guerra
Abstract

In document search, documents are typically treated as a flat list of keywords. To address the problem of syntactic interoperability, i.e., the use of different keywords to refer to the same real-world entity, entity linkage has been used to replace keywords in the text with a unique identifier of the entity to which they refer. Yet a flat list of entities fails to capture the actual relationships that exist among the entities, information that is significant for more effective document search. In this work we propose to go one step beyond entity linkage in text and model documents as a set of structures that describe relationships among the entities mentioned in the text. We show that this kind of representation significantly improves the effectiveness of document search. We describe the details of the implementation of this idea and present an extensive set of experimental results that support our claim.
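The contrast the abstract draws between a flat list of entities and a set of relationship structures can be illustrated with a toy sketch. This is not the authors' implementation: the alias table, the entity identifiers, the `Relation` type, and both matching functions are hypothetical names introduced only for illustration, and the matching rule (all query entities must co-occur in a single relation) is a simplifying assumption.

```python
from dataclasses import dataclass

# Hypothetical alias table standing in for entity linkage:
# several surface keywords map to one entity identifier.
ALIASES = {
    "big apple": "E:new_york_city",
    "nyc": "E:new_york_city",
    "met": "E:metropolitan_museum",
    "metropolitan museum": "E:metropolitan_museum",
    "yankees": "E:new_york_yankees",
    "louvre": "E:louvre",
}


@dataclass(frozen=True)
class Relation:
    """One relationship between two linked entities (illustrative only)."""
    subject: str
    predicate: str
    obj: str


def link(keyword):
    """Return the entity identifier for a keyword, or None if unknown."""
    return ALIASES.get(keyword.lower().strip())


def entities_of(relations):
    """Flatten a document's relations into a plain set of entity ids."""
    return {e for r in relations for e in (r.subject, r.obj)}


def flat_match(relations, keywords):
    """Flat view: every query entity appears somewhere in the document."""
    wanted = {link(k) for k in keywords}
    return None not in wanted and wanted <= entities_of(relations)


def related_match(relations, keywords):
    """Structured view: the query entities appear together in one relation."""
    wanted = {link(k) for k in keywords}
    if None in wanted:
        return False
    return any(wanted <= {r.subject, r.obj} for r in relations)


# Two toy documents that mention the same entities in different structures.
DOC_A = [Relation("E:metropolitan_museum", "located_in", "E:new_york_city")]
DOC_B = [
    Relation("E:new_york_yankees", "based_in", "E:new_york_city"),
    Relation("E:metropolitan_museum", "lends_to", "E:louvre"),
]

if __name__ == "__main__":
    query = ["Met", "Big Apple"]
    for name, doc in (("A", DOC_A), ("B", DOC_B)):
        print(name, flat_match(doc, query), related_match(doc, query))
```

For the query `["Met", "Big Apple"]`, the flat view accepts both toy documents because both mention the two entities, while the structured view accepts only document A, where the two entities are actually related to each other.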

Similar resources

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

Open Entity Extraction from Web Search Query Logs

In this paper we propose a completely unsupervised method for open-domain entity extraction and clustering over query logs. The underlying hypothesis is that classes defined by mining search user activity may significantly differ from those typically considered over web documents, in that they better model the user space, i.e. users’ perception and interests. We show that our method outperforms...

A Personalized Reordering System of Searched Web Pages

Commercial search engines retrieve documents that match a user's keywords, which reduces the effort and time the user needs to find relevant information. However, the list of web pages these engines return still does not spare the user all effort, because the search target usually differs from user to user according to the user profile. Thus, it is necessary to reorder the web page list accordin...

Ranking-Constrained Keyword Sequence Extraction from Web Documents

Given a large volume of Web documents, we consider the problem of finding the shortest keyword sequence for each document such that, when the keyword sequence is submitted to a given search engine, the corresponding Web document can be identified and is ranked first among the results. We call this system an Inverse Search Engine (ISE). Whenever a shortest keyword sequence ...

Proximity Keyword Search in Xml Documents Using CTREE Index

Proximity Keyword Search is especially useful when searching on the web and in long unstructured documents such as XML. This system is designed to handle novel features of Proximity Keyword Search in XML documents. It concentrates mainly on producing ranked results efficiently for keyword search queries over XML documents. The proposed system is the first of its kind in which the keyword string is ...

Kokono Search: A Location Based Search Engine

We have developed a location-based search system for web documents on the Internet. This system can find web documents based on the distance between locations that are described in web documents and a location specified by a user. It consists of three modules. (1) A robot that gathers documents from the Internet, (2) a parser that extracts address strings from web documents and associates latit...

Journal:
  • Trans. Computational Collective Intelligence

Volume 21

Publication date: 2016